Conversation
BenchmarksStartupParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 63 metrics, 8 unstable metrics. Startup time reports for petclinicgantt
title petclinic - global startup overhead: candidate=1.61.0-SNAPSHOT~2cb195db57, baseline=1.62.0-SNAPSHOT~81e22942d7
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.066 s) : 0, 1065526
Total [baseline] (11.152 s) : 0, 11151916
Agent [candidate] (1.06 s) : 0, 1060201
Total [candidate] (11.093 s) : 0, 11092560
section appsec
Agent [baseline] (1.25 s) : 0, 1250164
Total [baseline] (11.044 s) : 0, 11043649
Agent [candidate] (1.25 s) : 0, 1249852
Total [candidate] (11.156 s) : 0, 11156167
section iast
Agent [baseline] (1.226 s) : 0, 1225831
Total [baseline] (11.356 s) : 0, 11355610
Agent [candidate] (1.225 s) : 0, 1225146
Total [candidate] (11.349 s) : 0, 11349376
section profiling
Agent [baseline] (1.189 s) : 0, 1189176
Total [baseline] (11.076 s) : 0, 11076256
Agent [candidate] (1.184 s) : 0, 1183504
Total [candidate] (11.037 s) : 0, 11036740
gantt
title petclinic - break down per module: candidate=1.61.0-SNAPSHOT~2cb195db57, baseline=1.62.0-SNAPSHOT~81e22942d7
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.254 ms) : 0, 1254
crashtracking [candidate] (1.226 ms) : 0, 1226
BytebuddyAgent [baseline] (638.253 ms) : 0, 638253
BytebuddyAgent [candidate] (634.43 ms) : 0, 634430
AgentMeter [baseline] (29.737 ms) : 0, 29737
AgentMeter [candidate] (29.186 ms) : 0, 29186
GlobalTracer [baseline] (249.414 ms) : 0, 249414
GlobalTracer [candidate] (248.914 ms) : 0, 248914
AppSec [baseline] (32.488 ms) : 0, 32488
AppSec [candidate] (32.418 ms) : 0, 32418
Debugger [baseline] (60.187 ms) : 0, 60187
Debugger [candidate] (60.0 ms) : 0, 60000
Remote Config [baseline] (595.747 µs) : 0, 596
Remote Config [candidate] (599.167 µs) : 0, 599
Telemetry [baseline] (8.124 ms) : 0, 8124
Telemetry [candidate] (8.214 ms) : 0, 8214
Flare Poller [baseline] (9.074 ms) : 0, 9074
Flare Poller [candidate] (8.957 ms) : 0, 8957
section appsec
crashtracking [baseline] (1.24 ms) : 0, 1240
crashtracking [candidate] (1.222 ms) : 0, 1222
BytebuddyAgent [baseline] (663.298 ms) : 0, 663298
BytebuddyAgent [candidate] (662.161 ms) : 0, 662161
AgentMeter [baseline] (12.05 ms) : 0, 12050
AgentMeter [candidate] (12.038 ms) : 0, 12038
GlobalTracer [baseline] (248.846 ms) : 0, 248846
GlobalTracer [candidate] (248.943 ms) : 0, 248943
AppSec [baseline] (185.248 ms) : 0, 185248
AppSec [candidate] (185.333 ms) : 0, 185333
Debugger [baseline] (66.057 ms) : 0, 66057
Debugger [candidate] (66.367 ms) : 0, 66367
Remote Config [baseline] (616.719 µs) : 0, 617
Remote Config [candidate] (629.082 µs) : 0, 629
Telemetry [baseline] (8.315 ms) : 0, 8315
Telemetry [candidate] (8.596 ms) : 0, 8596
Flare Poller [baseline] (3.495 ms) : 0, 3495
Flare Poller [candidate] (3.615 ms) : 0, 3615
IAST [baseline] (24.511 ms) : 0, 24511
IAST [candidate] (24.533 ms) : 0, 24533
section iast
crashtracking [baseline] (1.232 ms) : 0, 1232
crashtracking [candidate] (1.228 ms) : 0, 1228
BytebuddyAgent [baseline] (800.456 ms) : 0, 800456
BytebuddyAgent [candidate] (799.922 ms) : 0, 799922
AgentMeter [baseline] (11.441 ms) : 0, 11441
AgentMeter [candidate] (11.503 ms) : 0, 11503
GlobalTracer [baseline] (239.301 ms) : 0, 239301
GlobalTracer [candidate] (239.598 ms) : 0, 239598
AppSec [baseline] (29.851 ms) : 0, 29851
AppSec [candidate] (29.854 ms) : 0, 29854
Debugger [baseline] (67.853 ms) : 0, 67853
Debugger [candidate] (67.067 ms) : 0, 67067
Remote Config [baseline] (554.163 µs) : 0, 554
Remote Config [candidate] (544.501 µs) : 0, 545
Telemetry [baseline] (9.488 ms) : 0, 9488
Telemetry [candidate] (9.73 ms) : 0, 9730
Flare Poller [baseline] (3.526 ms) : 0, 3526
Flare Poller [candidate] (3.527 ms) : 0, 3527
IAST [baseline] (25.862 ms) : 0, 25862
IAST [candidate] (25.887 ms) : 0, 25887
section profiling
crashtracking [baseline] (1.184 ms) : 0, 1184
crashtracking [candidate] (1.195 ms) : 0, 1195
BytebuddyAgent [baseline] (694.368 ms) : 0, 694368
BytebuddyAgent [candidate] (690.735 ms) : 0, 690735
AgentMeter [baseline] (9.132 ms) : 0, 9132
AgentMeter [candidate] (9.109 ms) : 0, 9109
GlobalTracer [baseline] (207.814 ms) : 0, 207814
GlobalTracer [candidate] (207.116 ms) : 0, 207116
AppSec [baseline] (32.891 ms) : 0, 32891
AppSec [candidate] (32.669 ms) : 0, 32669
Debugger [baseline] (65.791 ms) : 0, 65791
Debugger [candidate] (65.357 ms) : 0, 65357
Remote Config [baseline] (584.748 µs) : 0, 585
Remote Config [candidate] (601.316 µs) : 0, 601
Telemetry [baseline] (7.821 ms) : 0, 7821
Telemetry [candidate] (7.787 ms) : 0, 7787
Flare Poller [baseline] (3.55 ms) : 0, 3550
Flare Poller [candidate] (3.575 ms) : 0, 3575
ProfilingAgent [baseline] (94.445 ms) : 0, 94445
ProfilingAgent [candidate] (93.78 ms) : 0, 93780
Profiling [baseline] (95.003 ms) : 0, 95003
Profiling [candidate] (94.35 ms) : 0, 94350
Startup time reports for insecure-bankgantt
title insecure-bank - global startup overhead: candidate=1.61.0-SNAPSHOT~2cb195db57, baseline=1.62.0-SNAPSHOT~81e22942d7
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.069 s) : 0, 1069184
Total [baseline] (8.869 s) : 0, 8868813
Agent [candidate] (1.059 s) : 0, 1058569
Total [candidate] (8.85 s) : 0, 8849998
section iast
Agent [baseline] (1.234 s) : 0, 1233984
Total [baseline] (9.585 s) : 0, 9584940
Agent [candidate] (1.234 s) : 0, 1234267
Total [candidate] (9.602 s) : 0, 9601736
gantt
title insecure-bank - break down per module: candidate=1.61.0-SNAPSHOT~2cb195db57, baseline=1.62.0-SNAPSHOT~81e22942d7
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.249 ms) : 0, 1249
crashtracking [candidate] (1.24 ms) : 0, 1240
BytebuddyAgent [baseline] (640.401 ms) : 0, 640401
BytebuddyAgent [candidate] (633.946 ms) : 0, 633946
AgentMeter [baseline] (29.889 ms) : 0, 29889
AgentMeter [candidate] (29.243 ms) : 0, 29243
GlobalTracer [baseline] (252.049 ms) : 0, 252049
GlobalTracer [candidate] (249.12 ms) : 0, 249120
AppSec [baseline] (32.849 ms) : 0, 32849
AppSec [candidate] (32.455 ms) : 0, 32455
Debugger [baseline] (59.991 ms) : 0, 59991
Debugger [candidate] (59.114 ms) : 0, 59114
Remote Config [baseline] (604.675 µs) : 0, 605
Remote Config [candidate] (611.125 µs) : 0, 611
Telemetry [baseline] (8.148 ms) : 0, 8148
Telemetry [candidate] (8.215 ms) : 0, 8215
Flare Poller [baseline] (7.55 ms) : 0, 7550
Flare Poller [candidate] (8.293 ms) : 0, 8293
section iast
crashtracking [baseline] (1.252 ms) : 0, 1252
crashtracking [candidate] (1.251 ms) : 0, 1251
BytebuddyAgent [baseline] (808.297 ms) : 0, 808297
BytebuddyAgent [candidate] (806.812 ms) : 0, 806812
AgentMeter [baseline] (11.529 ms) : 0, 11529
AgentMeter [candidate] (11.577 ms) : 0, 11577
GlobalTracer [baseline] (240.499 ms) : 0, 240499
GlobalTracer [candidate] (241.861 ms) : 0, 241861
AppSec [baseline] (31.596 ms) : 0, 31596
AppSec [candidate] (31.798 ms) : 0, 31798
Debugger [baseline] (64.876 ms) : 0, 64876
Debugger [candidate] (64.816 ms) : 0, 64816
Remote Config [baseline] (547.515 µs) : 0, 548
Remote Config [candidate] (545.98 µs) : 0, 546
Telemetry [baseline] (9.5 ms) : 0, 9500
Telemetry [candidate] (9.399 ms) : 0, 9399
Flare Poller [baseline] (3.521 ms) : 0, 3521
Flare Poller [candidate] (3.543 ms) : 0, 3543
IAST [baseline] (25.986 ms) : 0, 25986
IAST [candidate] (26.231 ms) : 0, 26231
LoadParameters
See matching parameters
SummaryFound 2 performance improvements and 1 performance regressions! Performance is the same for 17 metrics, 16 unstable metrics.
Request duration reports for petclinicgantt
title petclinic - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~2cb195db57, baseline=1.62.0-SNAPSHOT~81e22942d7
dateFormat X
axisFormat %s
section baseline
no_agent (19.085 ms) : 18891, 19279
. : milestone, 19085,
appsec (20.071 ms) : 19864, 20277
. : milestone, 20071,
code_origins (17.888 ms) : 17710, 18065
. : milestone, 17888,
iast (17.79 ms) : 17610, 17971
. : milestone, 17790,
profiling (18.261 ms) : 18081, 18440
. : milestone, 18261,
tracing (17.981 ms) : 17805, 18157
. : milestone, 17981,
section candidate
no_agent (19.087 ms) : 18898, 19277
. : milestone, 19087,
appsec (18.707 ms) : 18519, 18895
. : milestone, 18707,
code_origins (18.119 ms) : 17941, 18298
. : milestone, 18119,
iast (17.772 ms) : 17594, 17949
. : milestone, 17772,
profiling (18.295 ms) : 18116, 18475
. : milestone, 18295,
tracing (17.979 ms) : 17800, 18157
. : milestone, 17979,
Request duration reports for insecure-bankgantt
title insecure-bank - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~2cb195db57, baseline=1.62.0-SNAPSHOT~81e22942d7
dateFormat X
axisFormat %s
section baseline
no_agent (1.265 ms) : 1252, 1277
. : milestone, 1265,
iast (3.221 ms) : 3180, 3263
. : milestone, 3221,
iast_FULL (6.044 ms) : 5983, 6106
. : milestone, 6044,
iast_GLOBAL (3.639 ms) : 3576, 3701
. : milestone, 3639,
profiling (2.219 ms) : 2199, 2239
. : milestone, 2219,
tracing (1.857 ms) : 1842, 1872
. : milestone, 1857,
section candidate
no_agent (1.248 ms) : 1236, 1260
. : milestone, 1248,
iast (3.378 ms) : 3328, 3428
. : milestone, 3378,
iast_FULL (6.195 ms) : 6130, 6260
. : milestone, 6195,
iast_GLOBAL (3.727 ms) : 3666, 3788
. : milestone, 3727,
profiling (2.22 ms) : 2200, 2240
. : milestone, 2220,
tracing (1.882 ms) : 1864, 1900
. : milestone, 1882,
DacapoParameters
See matching parameters
SummaryFound 1 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 1 unstable metrics.
Execution time for biojavagantt
title biojava - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~2cb195db57, baseline=1.62.0-SNAPSHOT~81e22942d7
dateFormat X
axisFormat %s
section baseline
no_agent (15.2 s) : 15200000, 15200000
. : milestone, 15200000,
appsec (14.714 s) : 14714000, 14714000
. : milestone, 14714000,
iast (18.729 s) : 18729000, 18729000
. : milestone, 18729000,
iast_GLOBAL (18.208 s) : 18208000, 18208000
. : milestone, 18208000,
profiling (15.144 s) : 15144000, 15144000
. : milestone, 15144000,
tracing (14.909 s) : 14909000, 14909000
. : milestone, 14909000,
section candidate
no_agent (15.292 s) : 15292000, 15292000
. : milestone, 15292000,
appsec (14.926 s) : 14926000, 14926000
. : milestone, 14926000,
iast (18.338 s) : 18338000, 18338000
. : milestone, 18338000,
iast_GLOBAL (17.931 s) : 17931000, 17931000
. : milestone, 17931000,
profiling (15.212 s) : 15212000, 15212000
. : milestone, 15212000,
tracing (14.932 s) : 14932000, 14932000
. : milestone, 14932000,
Execution time for tomcatgantt
title tomcat - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~2cb195db57, baseline=1.62.0-SNAPSHOT~81e22942d7
dateFormat X
axisFormat %s
section baseline
no_agent (1.491 ms) : 1480, 1503
. : milestone, 1491,
appsec (3.824 ms) : 3601, 4046
. : milestone, 3824,
iast (2.289 ms) : 2219, 2359
. : milestone, 2289,
iast_GLOBAL (2.323 ms) : 2253, 2393
. : milestone, 2323,
profiling (2.539 ms) : 2375, 2704
. : milestone, 2539,
tracing (2.089 ms) : 2035, 2144
. : milestone, 2089,
section candidate
no_agent (1.491 ms) : 1479, 1502
. : milestone, 1491,
appsec (2.54 ms) : 2485, 2595
. : milestone, 2540,
iast (2.278 ms) : 2208, 2348
. : milestone, 2278,
iast_GLOBAL (2.333 ms) : 2263, 2403
. : milestone, 2333,
profiling (2.122 ms) : 2066, 2178
. : milestone, 2122,
tracing (2.086 ms) : 2032, 2140
. : milestone, 2086,
|
# Conflicts: # dd-trace-core/src/main/java/datadog/trace/core/DDSpan.java # dd-trace-core/src/main/java/datadog/trace/core/DDSpanContext.java
|
Hi! 👋 Thanks for your pull request! 🎉 To help us review it, please make sure to:
If you need help, please check our contributing guidelines. |
|
The following files add Groovy tests to modules that are candidates for migration to Java / JUnit 5:
Consider writing these tests in Java / JUnit 5 instead to help with the ongoing migration effort. |
|
Besides just reviewing the code, I wanted to see how this change impacts our overhead.
I suspect throughput will improve once this branch is synchronized with master. The cache will just help a little more by de-duplicating across payloads. |
Yep, totally agree on caching, we can improve in follow up PRs. |
| if (DDTags.SPAN_EVENTS.equals(entry.tag())) { | ||
| continue; | ||
| } | ||
| writeFlattenedTagAttribute(writable, entry.tag(), entry.objectValue()); |
There was a problem hiding this comment.
I'd like to try to avoid boxing primitives here.
I think making a variant of writeFlattenedTagAttribute that takes writable, entryReader might make sense.
Then we check if the TagMap.EntryReader is holding a primitive before calling objectValue.
There was a problem hiding this comment.
Refactored.
| static final int SPAN_KIND_CONSUMER = 5; | ||
|
|
||
| // Decision maker tag key | ||
| private static final String KEY_DECISION_MAKER = "_dd.p.dm"; |
There was a problem hiding this comment.
We probably want to use Utf8BytesString for these. It caches the UTF8 representation, so we don't keep paying the encoding cost.
Admittedly, I didn't double check where KEY_DECISION_MAKER is used, so I'll let you make the final decision on that.
There was a problem hiding this comment.
I double checked the code and I can see that KEY_DECISION_MAKER used for lookup only, so not sure how UTF8 can be applied here. I think the we can do one more round of optimizations for UTF8 cached values in follow up PRs, but probably v1 should be good as-is as it is maintaining string table with all unique strings already, that can be considered as a cache already.
| writable.writeInt(fieldId); | ||
| writable.startArray(spansCount); | ||
|
|
||
| for (int i = 0; i < spansCount; i++) { |
There was a problem hiding this comment.
Is the c-style for loop, so we can access the index?
As an alternative, we could use a ListIterator.
I've typically found Iterators to be better than c-style for loops, but I'll admit it can be situational. We probably just need to benchmark with v1 and see.
I can share my stress test if you like.
There was a problem hiding this comment.
Refactored.
| /** | ||
| * @return High-order and low-order bits as bytes array. | ||
| */ | ||
| public byte[] to128BitBytes() { |
There was a problem hiding this comment.
On second look (along with Claude), I wonder if this is the source of the extra byte[] allocation.
I'd thought about whether we could optimize this, but stopped short of proposing doing so previously.
NOTE: I'm also fine with doing this optimization later.
But this is the approach Claude proposed...
- DDTraceId.to128BitBytes() allocates two objects per span and per link — DDTraceId.java:105-113
ByteBuffer buffer = ByteBuffer.allocate(16);
buffer.putLong(toHighOrderLong());
buffer.putLong(toLong());
return buffer.array();
Called once per span (TraceMapperV1.java:1264) and once per span link (:1351). The ByteBuffer wrapper is overhead — ByteBuffer.allocate allocates both the
buffer object and the backing byte[]. The returned byte[] is then handed to writable.writeBinary(bytes) and discarded. Two better options:
- Use byte[] bytes = new byte[16]; ByteBuffer.wrap(bytes).putLong(hi).putLong(lo); return bytes; — halves allocation.
- Better: add Writable.writeBinaryTraceId(long hi, long lo) so the mapper writes 16 bytes directly into the existing target buffer with zero allocation.
dougqh
left a comment
There was a problem hiding this comment.
One other performance suggestion from Claude...
- StringTable lookups force String materialization and Integer boxing — TraceMapperV1.java:1695-1706, :1832-1866
Every tag name, tag value, service name, operation name, resource, component, etc. goes through:
String str = value == null ? "" : value.toString();
Integer index = stringTable.get(str);
Two issues:
- value.toString() allocates a new String for every UTF8BytesString / non-String CharSequence. Given V1 is a streaming string table design intended to amortize
string cost, this re-materializes on every lookup and defeats part of the win.- HashMap<String,Integer>.get() autoboxes the returned index.
Consider a CharSequence-keyed open-addressed primitive int map (dd-trace already has UTF8ByteStringMap-like structures; verify). Even dropping to Eclipse
Collections' ObjectIntHashMap gets rid of boxing.
I'm guess StringTable is pre-existing. I agree with Claude that it would be nice to reduce the allocation especially the boxing.
But again, I'm fine with leaving this to a later PR.
dougqh
left a comment
There was a problem hiding this comment.
Looks good to me.
I added a couple comments based on asking Claude to do a perf review.
I do think optimizing might the ID to byte conversion would be good.
I suspect that is where some (most?) of the extra 2GiB of byte[] were coming from.
But I'm fine with doing that in a separate pull request
What Does This Do
Java implementation for protocol v1.
Motivation
Additional Notes
DO NOT MERGE, WIP.